107 research outputs found
Automatic Search Intervals for the Smoothing Parameter in Penalized Splines
The selection of the smoothing parameter is central to the estimation of
penalized splines. The best value of the smoothing parameter is often the one
that optimizes a smoothness selection criterion, such as generalized
cross-validation error (GCV) and restricted likelihood (REML). To correctly
identify the global optimum rather than being trapped in an undesired local
optimum, grid search is recommended for optimization. Unfortunately, the grid
search method requires a pre-specified search interval that contains the
unknown global optimum, yet no guideline is available for providing this
interval. As a result, practitioners have to find it by trial and error. To
overcome such difficulty, we develop novel algorithms to automatically find
this interval. Our automatic search interval has four advantages. (i) It
specifies a smoothing parameter range where the associated penalized least
squares problem is numerically solvable. (ii) It is criterion-independent so
that different criteria, such as GCV and REML, can be explored on the same
parameter range. (iii) It is sufficiently wide to contain the global optimum of
any criterion, so that for example, the global minimum of GCV and the global
maximum of REML can both be identified. (iv) It is computationally cheap
compared with the grid search itself, carrying no extra computational burden in
practice. Our method is ready to use through our recently developed R package
gps (>= version 1.1). It may be embedded in more advanced statistical modeling
methods that rely on penalized splines. Comment: R code is available at
https://github.com/ZheyuanLi/gps-vignettes/blob/main/gps2.pd
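The grid search described in this abstract can be illustrated with a minimal sketch. It is not the gps package or the authors' automatic interval algorithm: it assumes a Whittaker-type smoother (identity basis with a second-order difference penalty) as a stand-in for a penalized spline, and the search interval `log_grid` is hand-picked for illustration, which is exactly the trial-and-error step the paper aims to remove. The function name `gcv_grid_search` and the simulated data are hypothetical.

```python
import numpy as np

def gcv_grid_search(y, log_lambdas, diff_order=2):
    """Evaluate GCV on a grid of log(lambda) values for a Whittaker-type
    penalized smoother and return (best_lambda, gcv_scores)."""
    n = len(y)
    # Difference matrix D of shape (n - diff_order, n); the penalty is lambda * D'D.
    D = np.diff(np.eye(n), n=diff_order, axis=0)
    P = D.T @ D
    scores = []
    for log_lam in log_lambdas:
        lam = np.exp(log_lam)
        # Hat matrix H = (I + lambda * P)^{-1}, so the fitted values are H @ y.
        H = np.linalg.solve(np.eye(n) + lam * P, np.eye(n))
        resid = y - H @ y
        edf = np.trace(H)  # effective degrees of freedom
        scores.append(n * (resid @ resid) / (n - edf) ** 2)
    scores = np.array(scores)
    return np.exp(log_lambdas[np.argmin(scores)]), scores

# Simulated noisy data (assumption, for illustration only).
rng = np.random.default_rng(0)
x = np.linspace(0, 1, 100)
y = np.sin(2 * np.pi * x) + 0.3 * rng.standard_normal(100)

# Hand-picked search interval on the log scale (assumption).
log_grid = np.linspace(-10, 15, 51)
best_lam, scores = gcv_grid_search(y, log_grid)
print(f"lambda minimizing GCV on the grid: {best_lam:.4g}")
```

Because the criterion is evaluated at every grid point, the minimum found is global over the grid, but only if the interval is wide enough to contain the true optimum.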
Exploring the Cognitive Knowledge Structure of Large Language Models: An Educational Diagnostic Assessment Approach
Large Language Models (LLMs) have not only exhibited exceptional performance
across various tasks, but also demonstrated sparks of intelligence. Recent
studies have focused on assessing their capabilities on human exams and
revealed their impressive competence in different domains. However, cognitive
research on the overall knowledge structure of LLMs is still lacking. In this
paper, based on an educational diagnostic assessment method, we conduct an
evaluation using MoocRadar, a meticulously annotated human test dataset based
on Bloom's Taxonomy. We aim to reveal the knowledge structures of LLMs and gain
insights into their cognitive capabilities. This research emphasizes the
significance of investigating LLMs' knowledge and understanding the disparate
cognitive patterns of LLMs. By shedding light on models' knowledge, researchers
can advance the development and utilization of LLMs in a more informed and
effective manner. Comment: Findings of EMNLP 2023 (Short Paper)
A physical neural network training approach toward multi-plane light conversion design
Multi-plane light converter (MPLC) designs supporting hundreds of modes are
attractive in high-throughput optical communications. These photonic structures
typically comprise >10 phase masks in free space, with millions of independent
design parameters. Conventional MPLC design using wavefront matching updates
one mask at a time while fixing the rest. Here we construct a physical neural
network (PNN) to model the light propagation and phase modulation in MPLC,
providing access to the entire parameter set for optimization, including not
only profiles of the phase masks but also the distances between them. PNN training
supports flexible optimization sequences and is a superset of existing MPLC
design methods. In addition, our method allows tuning of hyperparameters of PNN
training such as learning rate and batch size. Because PNN-based MPLC is found
to be insensitive to the number of input and target modes in each training
step, we have demonstrated a high-order MPLC design (45 modes) using mini
batches that fit into the available computing resources. Comment: Draft for submission to Optics Express
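The forward model that such a network has to represent, phase modulation at each mask followed by free-space propagation, can be sketched as below. This is only an illustrative forward pass under stated assumptions, not the authors' PNN: it uses the angular-spectrum method for propagation, fixed random phase masks, and hypothetical grid, wavelength, and spacing values. Training the masks and distances would additionally need an automatic-differentiation framework, which is not shown.

```python
import numpy as np

def propagate(field, wavelength, dx, z):
    """Angular-spectrum free-space propagation of a square 2D complex field."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)  # spatial frequencies in cycles per metre
    FX, FY = np.meshgrid(fx, fx, indexing="ij")
    kz_sq = (1.0 / wavelength) ** 2 - FX ** 2 - FY ** 2
    propagating = kz_sq > 0  # evanescent components are dropped
    kz = 2 * np.pi * np.sqrt(np.where(propagating, kz_sq, 0.0))
    transfer = np.where(propagating, np.exp(1j * kz * z), 0.0)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)

def mplc_forward(field, phase_masks, distances, wavelength, dx):
    """Apply each phase mask, then propagate over the distance that follows it."""
    for phase, z in zip(phase_masks, distances):
        field = propagate(field * np.exp(1j * phase), wavelength, dx, z)
    return field

# Hypothetical setup: 64 x 64 grid, 10 um pixels, 1.55 um wavelength.
n, dx, wavelength = 64, 10e-6, 1.55e-6
coords = (np.arange(n) - n / 2) * dx
X, Y = np.meshgrid(coords, coords, indexing="ij")
field_in = np.exp(-(X ** 2 + Y ** 2) / (100e-6) ** 2).astype(complex)

rng = np.random.default_rng(0)
masks = rng.uniform(0, 2 * np.pi, size=(3, n, n))  # untrained random masks
distances = [0.02, 0.02, 0.02]  # mask spacings in metres (assumption)

field_out = mplc_forward(field_in, masks, distances, wavelength, dx)
power_in = np.sum(np.abs(field_in) ** 2)
power_out = np.sum(np.abs(field_out) ** 2)
```

In this sketch both `masks` and `distances` enter the forward pass as ordinary arguments, which is what makes it possible, in principle, to treat all of them as optimization variables at once rather than updating one mask at a time.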
SUIT: Learning Significance-guided Information for 3D Temporal Detection
3D object detection from LiDAR point clouds is of critical importance for
autonomous driving and robotics. While sequential point clouds have the potential
to enhance 3D perception through temporal information, utilizing these temporal
features effectively and efficiently remains a challenging problem. Based on
the observation that the foreground information is sparsely distributed in
LiDAR scenes, we believe sufficient knowledge can be provided by a sparse format
rather than dense maps. To this end, we propose to learn Significance-gUided
Information for 3D Temporal detection (SUIT), which simplifies temporal
information as sparse features for information fusion across frames.
Specifically, we first introduce a significant sampling mechanism that extracts
information-rich yet sparse features based on predicted object centroids. On
top of that, we present an explicit geometric transformation learning
technique, which learns the object-centric transformations among sparse
features across frames. We evaluate our method on the large-scale nuScenes and
Waymo datasets, where our SUIT not only significantly reduces the memory and
computation cost of temporal fusion, but also performs favorably against the
state-of-the-art baselines. Comment: Accepted to IROS 202